A hierarchical clusterer ensemble method based on boosting theory
Abstract
Bagging and boosting are two well-known, successful methods for building classifier ensembles. Clusterer ensemble methods that apply the boosting concept are known to produce clusterings of higher quality and robustness. In this paper, we introduce a new boosting-based hierarchical clusterer ensemble method called Bob-Hic. The method creates a consensus hierarchical clustering (h-clustering) of a dataset, which helps improve clustering accuracy. Bob-Hic runs several boosting iterations; in each iteration, a weighted random sample is first drawn from the original dataset, and an individual h-clustering is then created on the selected samples. At the end of the iterations, the individual clusterings are combined into a final consensus h-clustering. The intermediate structures used in the combination are distance descriptor matrices, each corresponding to an individual h-clustering result. The final integration is done through an information-theoretic approach. Experiments on both synthetic and popular real datasets confirm that the proposed method improves on the results of simple clustering algorithms and provides better consensus clustering quality than other available ensemble techniques.
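The loop described in the abstract can be illustrated with a minimal sketch. This is not the Bob-Hic implementation; several pieces are assumptions made only for illustration: single-linkage clustering stands in for the individual h-clustering, the cophenetic distance matrix stands in for the distance descriptor, plain averaging replaces the information-theoretic combination, and the weight update (doubling the weight of unsampled points) is hypothetical. All function and parameter names are invented here.

```python
import random


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cophenetic_single_linkage(points):
    """Single-linkage agglomerative clustering. Returns the cophenetic
    matrix: the merge height at which each pair first shares a cluster."""
    n = len(points)
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = d
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return coph


def boosted_hierarchical_ensemble(data, n_iterations=5, sample_size=None, seed=0):
    """Returns an n x n consensus distance matrix; an entry is None when the
    two points were never sampled together."""
    rng = random.Random(seed)
    n = len(data)
    sample_size = sample_size or n
    weights = [1.0 / n] * n
    sums = [[0.0] * n for _ in range(n)]
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_iterations):
        # Weighted random sampling with replacement, then de-duplicated.
        chosen = sorted(set(rng.choices(range(n), weights=weights, k=sample_size)))
        # Individual h-clustering on the sample, summarized as a matrix.
        coph = cophenetic_single_linkage([data[i] for i in chosen])
        for p, i in enumerate(chosen):
            for q, j in enumerate(chosen):
                sums[i][j] += coph[p][q]
                counts[i][j] += 1
        # ASSUMPTION: boost points that were left out of this sample.
        sampled = set(chosen)
        weights = [w if i in sampled else 2.0 * w for i, w in enumerate(weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    # ASSUMPTION: simple averaging instead of information-theoretic fusion.
    return [[sums[i][j] / counts[i][j] if counts[i][j] else None
             for j in range(n)] for i in range(n)]
```

Calling `boosted_hierarchical_ensemble(points, n_iterations=10)` on a list of coordinate tuples returns the averaged matrix; a final consensus hierarchy could then be obtained by running any hierarchical clustering on that matrix.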
Similar resources
Diversity-Based Boosting Algorithm
Boosting is a well-known and efficient technique for constructing a classifier ensemble. An ensemble is built incrementally by altering the distribution of the training data set and forcing learners to focus on misclassification errors. In this paper, an improvement to the Boosting algorithm, called the DivBoosting algorithm, is proposed and studied. Experiments on several data sets are conducted on both Boo...
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System
In this paper, a boosting-based incremental hybrid intrusion detection system is introduced. This system combines incremental misuse detection and incremental anomaly detection. We use a boosting ensemble of weak classifiers to implement the misuse intrusion detection system. It can identify new classes of intrusions that do not exist in the training dataset for incremental misuse detection. As...
Network Game and Boosting
We propose an ensemble learning method called Network Boosting, which combines weak learners based on a random graph (network). A theoretical analysis based on game theory shows that the algorithm can learn the target hypothesis asymptotically. The comparison results using several datasets of the UCI machine learning repository and synthetic data are promising and show that Network Bo...
Boosting-Based System Combination for Machine Translation
In this paper, we present a simple and effective method to address the issue of how to generate diversified translation systems from a single Statistical Machine Translation (SMT) engine for system combination. Our method is based on the framework of boosting. First, a sequence of weak translation systems is generated from a baseline system in an iterative manner. Then, a strong translation sys...
Journal title:
- Knowl.-Based Syst.
Volume 45
Pages: -
Publication date: 2013